Dynamicity vs. Effectiveness: A User Study of a Clustering Algorithm for Scatter/Gather
Abstract
We proposed and implemented a novel clustering algorithm called LAIR2, which has linear worst-case time complexity and constant average running time for on-the-fly Scatter/Gather browsing [4]. Our previous experiments showed that, running on a single processor, the LAIR2 on-line clustering algorithm was several hundred times faster than the parallel Buckshot algorithm running on multiple processors [11]. This paper reports on a study that examined the effectiveness of the LAIR2 algorithm in terms of clustering quality and its impact on retrieval performance. We conducted a user study with 24 subjects to evaluate on-the-fly LAIR2 clustering in Scatter/Gather search tasks, comparing its performance with that of the Buckshot algorithm, a classic method for Scatter/Gather browsing [4]. The results showed significant differences in subjective perceptions of clustering quality: subjects perceived that the LAIR2 algorithm produced significantly better clusters than the Buckshot method. Subjects also felt that the LAIR2 system required less effort to complete the tasks and was more effective in helping them with those tasks. Interesting patterns also emerged from the subjects’ comments in the final open-ended questionnaire. We discuss the implications and directions for future research.
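For readers unfamiliar with the baseline, the following is a minimal sketch of the Buckshot idea as it is commonly described: cluster a random sample of roughly sqrt(k·n) documents agglomeratively to obtain k seed centers, then assign every document to its most similar center. It is not the implementation used in the study, and LAIR2 is not shown because the abstract does not describe its internals. The function names, the use of cosine similarity, and the centroid-based merging rule are all assumptions made for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerate(vectors, k):
    """Hypothetical helper: repeatedly merge the two clusters whose centroids
    are most similar until at most k clusters remain; return their centroids."""
    clusters = [[v] for v in vectors]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]


def buckshot(vectors, k, seed=0):
    """Return one cluster label per input vector (illustrative sketch only)."""
    rng = random.Random(seed)
    n = len(vectors)
    # Sample about sqrt(k * n) items so the quadratic agglomerative step
    # costs on the order of k * n similarity comparisons per merge pass.
    sample_size = min(n, max(k, math.ceil(math.sqrt(k * n))))
    sample = rng.sample(vectors, sample_size)
    centers = agglomerate(sample, k)
    # Assign every document to its most similar seed center.
    return [
        max(range(len(centers)), key=lambda c: cosine(v, centers[c]))
        for v in vectors
    ]


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05],
            [0.0, 1.0], [0.1, 0.9], [0.05, 1.0]]
    print(buckshot(docs, k=2))
```

The sampling step is what keeps the expensive agglomerative phase small; the final assignment pass is linear in the number of documents for a fixed k.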
Similar resources
Parallel and Distributed Scatter-Gather Clustering System Development Proposal
From the scatter-gather process explained above, it is clear that the essence of the parallel version of this algorithm is the parallel clustering algorithm used in the scatter phase. Frieder et al. implemented a parallel version of the Buckshot clustering algorithm [1]. Their work meets the need for a parallel scatter-gather clustering algorithm quite well, although we can des...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...
Hybrid ANFIS with ant colony optimization algorithm for prediction of shear wave velocity from a carbonate reservoir in Iran
Shear wave velocity (Vs) data are key information for petrophysical, geophysical, and geomechanical studies. Although compressional wave velocity (Vp) measurements exist for almost all wells, shear wave velocity is not recorded for most older wells because the necessary tools were not available. Furthermore, measuring shear wave velocity is to some extent costly. This study proposes a novel methodolo...
A Fast Online Clustering Algorithm for Scatter/Gather Browsing
We present a fast online clustering algorithm that has linear worst-case time complexity and constant average running time for the well-known online, visually oriented browsing model called Scatter/Gather browsing (Cutting, Karger, Pedersen, and Tukey 1992). Our experiment shows that, when running on a single processor, this fast online clustering algorithm is a few hundred times faster than the par...
Evolutionary User Clustering Based on Time-Aware Interest Changes in the Recommender System
The abundance of data on the Internet has created problems for users and made it difficult to find the right information. In addition, users' tastes and preferences change over time. Recommender systems can help users find useful information. Because interests change, such systems must be able to evolve. In order to solve this problem, users are clustered that determine the most desirable users, it pa...
Publication date: 2008